Add optional MLflow logging to the cross-validation CLI by gbeane · Pull Request #407 · KumarLabJax/JABS-behavior-classifier

gbeane · 2026-06-24T00:36:48Z

Summary

Adds opt-in MLflow tracking to the jabs-cli cross-validation command. Each run can record aggregate cross-validation metrics, run parameters, descriptive tags, and the training report as an artifact, so cross-validation runs of a behavior can be compared over time. MLflow is a fully optional dependency — the base install and all existing behavior are unchanged when it isn't used.

Tracks KLAUS-444.

What's included

New module jabs.classifier.mlflow_logging

log_cross_validation_to_mlflow(...) — creates one MLflow run, logs metrics/params/tags + the report artifact, returns (run_id, tracking_uri).
Helpers: aggregate_cv_metrics, build_params, build_tags, resolve_experiment_name, parse_kv_tags, load_env_file, mlflow_available, and MlflowLoggingError.
import mlflow is lazy (inside the logging function only), so the base package never depends on it.
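A minimal sketch of the availability check and lazy import described above, assuming the check is built on `importlib.util.find_spec` (the review discussion further down mentions "a located spec"). `module_available` and `lazy_import_mlflow` are hypothetical helper names used for illustration; they are not taken from the module itself.

```python
import importlib
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


def mlflow_available() -> bool:
    # A located spec means the package is installed and discoverable;
    # it does not guarantee that `import mlflow` will succeed.
    return module_available("mlflow")


def lazy_import_mlflow():
    """Import mlflow only at the point of use (hypothetical helper)."""
    if not mlflow_available():
        raise RuntimeError("install the 'mlflow' extra to enable logging")
    return importlib.import_module("mlflow")
```

Because nothing imports `mlflow` at module load time, importing the surrounding package works on a base install.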

Optional dependency — new mlflow extra: pip install 'jabs-behavior-classifier[mlflow]'.

CLI options on cross-validation

--mlflow [ENV_FILE] — enable logging; optional .env file with MLFLOW_* connection settings (ambient env if omitted).
--mlflow-experiment NAME — override the experiment (see below).
--mlflow-tag KEY=VALUE — repeatable free-form run tags.
--mlflow-no-report — skip the report artifact (metrics + params only).
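The option shapes above can be sketched with `argparse`. This is an illustration only: the PR does not say which argument-parsing library jabs-cli uses, and the `AMBIENT_ENV` sentinel and this `parse_kv_tags` body are assumptions, not the project's code.

```python
import argparse

# Hypothetical sentinel meaning "--mlflow was given without an ENV_FILE".
AMBIENT_ENV = "<ambient>"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cross-validation")
    parser.add_argument("project")
    parser.add_argument("--behavior")
    # nargs="?" makes the ENV_FILE value optional; const is used when it is omitted.
    parser.add_argument("--mlflow", nargs="?", const=AMBIENT_ENV, default=None,
                        metavar="ENV_FILE")
    parser.add_argument("--mlflow-experiment", metavar="NAME")
    # action="append" makes the option repeatable.
    parser.add_argument("--mlflow-tag", action="append", default=[],
                        metavar="KEY=VALUE")
    parser.add_argument("--mlflow-no-report", action="store_true")
    return parser


def parse_kv_tags(items: list[str]) -> dict[str, str]:
    """Turn ["k=v", ...] into {"k": "v"}; split on the first "=" only."""
    tags: dict[str, str] = {}
    for item in items:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        tags[key] = value
    return tags
```

With `nargs="?"`, a bare `--mlflow` placed directly before a positional argument would consume it as ENV_FILE, so the sketch assumes the project path comes first.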

Per-behavior experiments — runs default to experiment jabs-<behavior> so a behavior's runs form their own leaderboard (mixing behaviors isn't comparable). Precedence: --mlflow-experiment → MLFLOW_EXPERIMENT_NAME → jabs-<behavior>. The experiment is auto-created.
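The precedence rule can be sketched as follows. The function name matches the helper listed above, but the signature and body here are assumptions made for illustration.

```python
import os
from collections.abc import Mapping
from typing import Optional


def resolve_experiment_name(
    behavior: str,
    cli_experiment: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the experiment: CLI override, then env var, then jabs-<behavior>."""
    env = os.environ if env is None else env
    if cli_experiment:
        return cli_experiment
    env_name = env.get("MLFLOW_EXPERIMENT_NAME")
    if env_name:
        return env_name
    return f"jabs-{behavior}"
```

Passing `env` explicitly keeps the function testable without touching the real process environment.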

Leaderboard metrics — cv_f1_behavior_mean, cv_accuracy_mean, precision/recall (mean + std), iteration count, and dataset composition are logged as MLflow metrics, so the experiment's runs table is sortable by mean F1. Full per-fold detail rides along as the report artifact.
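A rough sketch of how per-fold results might be reduced to leaderboard metrics. Only `cv_f1_behavior_mean` and `cv_accuracy_mean` are names from this PR; the input shape, the `_std` suffix, the `cv_iterations` key, and the use of population standard deviation are assumptions.

```python
from statistics import fmean, pstdev


def aggregate_cv_metrics(folds: list[dict[str, float]]) -> dict[str, float]:
    """Reduce a list of per-fold metric dicts to mean/std summary metrics."""
    if not folds:
        raise ValueError("at least one cross-validation fold is required")
    summary: dict[str, float] = {"cv_iterations": float(len(folds))}
    for key in folds[0]:
        values = [fold[key] for fold in folds]
        summary[f"cv_{key}_mean"] = fmean(values)
        # Population std; the real module may use the sample std instead.
        summary[f"cv_{key}_std"] = pstdev(values)
    return summary
```

Logging flat numeric values like these as MLflow metrics is what makes the runs table sortable by a column such as mean F1.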

Exit codes & failure handling

Logging runs after results are printed and the report is saved.
--mlflow requested but the mlflow extra not installed → fail fast with an error before running cross-validation, exit 1. Logging was explicitly requested but can't be honored, so the command stops rather than silently producing a run with no logging; install the extra (or drop --mlflow) and re-run.
--mlflow ENV_FILE path doesn't exist → fail fast before running cross-validation, exit 1. The env-file path is validated up front (with a leading ~ expanded), so a typo is caught immediately rather than after the run.
Push fails (server, auth, or TLS error) → results and report are preserved, a warning is printed, and the command exits with code 3 (distinct from the generic exit 1).
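The ordering of checks and the exit-code mapping above can be sketched like this. The function, its parameters, and the constant names are hypothetical; only the exit codes (1 and 3), the fail-fast ordering, and the `MlflowLoggingError` name come from this PR.

```python
import sys
from pathlib import Path
from typing import Callable, Optional

EXIT_OK = 0
EXIT_ERROR = 1
EXIT_MLFLOW_PUSH_FAILED = 3


class MlflowLoggingError(Exception):
    """Raised when pushing a run to the tracking server fails."""


def run_with_logging(
    env_file: Optional[str],
    mlflow_installed: bool,
    run_cv: Callable[[], dict],
    push: Callable[[dict], None],
) -> int:
    # Fail fast: both checks happen before any cross-validation work.
    if not mlflow_installed:
        print("error: --mlflow requires the 'mlflow' extra", file=sys.stderr)
        return EXIT_ERROR
    if env_file is not None and not Path(env_file).expanduser().is_file():
        print(f"error: env file not found: {env_file}", file=sys.stderr)
        return EXIT_ERROR

    results = run_cv()  # results are printed and the report saved before logging
    try:
        push(results)
    except MlflowLoggingError as exc:
        # Results and report are already on disk; only the push failed.
        print(f"warning: MLflow logging failed: {exc}", file=sys.stderr)
        return EXIT_MLFLOW_PUSH_FAILED
    return EXIT_OK
```

A distinct exit code lets a calling script tell "cross-validation failed" apart from "cross-validation succeeded but the run was not recorded".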

Docs — both copies (online + in-app cli-tools.md) gain a jabs-cli cross-validation command section and a detailed MLflow integration section (install, enabling, connection config, experiment selection, leaderboard, tags, exit codes).

Example

jabs-cli cross-validation /path/to/project --behavior grooming \
    --mlflow settings.env --mlflow-tag purpose=baseline
# -> logs to experiment "jabs-grooming"
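For illustration, a sketch of what reading a file like `settings.env` could involve. `MLFLOW_TRACKING_URI` and `MLFLOW_TRACKING_USERNAME` are standard MLflow environment variables, but the sample values, the `parse_env_text` helper, and the choices to ignore non-`MLFLOW_*` keys and to override the ambient environment are assumptions, not confirmed behavior of the PR's `load_env_file`.

```python
import os
from collections.abc import MutableMapping
from pathlib import Path
from typing import Optional

# Example file contents (values are made up):
SAMPLE_ENV = """\
# settings.env
MLFLOW_TRACKING_URI=https://mlflow.example.org
MLFLOW_TRACKING_USERNAME="alice"
UNRELATED=ignored
"""


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, keeping only MLFLOW_* keys (assumed filter)."""
    loaded: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or not key.startswith("MLFLOW_"):
            continue
        loaded[key] = value.strip().strip('"').strip("'")
    return loaded


def load_env_file(
    path: str, environ: Optional[MutableMapping[str, str]] = None
) -> dict[str, str]:
    """Read the file (with ~ expanded) and apply its settings to environ."""
    environ = os.environ if environ is None else environ
    loaded = parse_env_text(Path(path).expanduser().read_text())
    environ.update(loaded)
    return loaded
```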

Testing

New tests/classifier/test_mlflow_logging.py (logging module, with a fake mlflow injected — no server/network) and MLflow CLI option-parsing tests in tests/scripts/test_cross_validation_cli.py.
Full tests/classifier/ + tests/scripts/: 305 passed. ruff check/format clean.
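A sketch of the fake-injection idea, assuming the fake is placed in `sys.modules` so that a lazy `import mlflow` resolves to it. `make_fake_mlflow`, `log_run`, and `run_with_fake_mlflow` are hypothetical names; the stubbed functions (`set_experiment`, `start_run`, `log_metrics`, `set_tags`, `get_tracking_uri`) mirror real MLflow entry points.

```python
import sys
import types
from contextlib import contextmanager


def make_fake_mlflow(calls: list) -> types.ModuleType:
    """Build a stand-in `mlflow` module that records calls instead of sending them."""
    fake = types.ModuleType("mlflow")

    @contextmanager
    def start_run(**kwargs):
        calls.append(("start_run", kwargs))
        yield types.SimpleNamespace(info=types.SimpleNamespace(run_id="fake-run-id"))

    fake.set_experiment = lambda name: calls.append(("set_experiment", name))
    fake.start_run = start_run
    fake.log_metrics = lambda metrics: calls.append(("log_metrics", dict(metrics)))
    fake.set_tags = lambda tags: calls.append(("set_tags", dict(tags)))
    fake.get_tracking_uri = lambda: "file:///fake"
    return fake


def log_run(experiment: str, metrics: dict, tags: dict) -> tuple:
    import mlflow  # lazy: resolved at call time, so an injected fake is picked up

    mlflow.set_experiment(experiment)
    with mlflow.start_run() as run:
        mlflow.log_metrics(metrics)
        mlflow.set_tags(tags)
        return run.info.run_id, mlflow.get_tracking_uri()


def run_with_fake_mlflow(experiment: str, metrics: dict, tags: dict):
    """Inject the fake, call the logger, then restore sys.modules."""
    calls: list = []
    saved = sys.modules.get("mlflow")
    sys.modules["mlflow"] = make_fake_mlflow(calls)
    try:
        result = log_run(experiment, metrics, tags)
    finally:
        if saved is None:
            sys.modules.pop("mlflow", None)
        else:
            sys.modules["mlflow"] = saved
    return result, calls
```

In a pytest suite the same injection is typically done with `monkeypatch.setitem(sys.modules, "mlflow", fake)`, which handles the restore automatically.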

Copilot

Pull request overview

Adds opt-in MLflow tracking to the jabs-cli cross-validation workflow so cross-validation runs can be logged (metrics/params/tags + optional report artifact) and compared over time, while keeping MLflow as a fully optional dependency.

Changes:

Introduces jabs.classifier.mlflow_logging with helpers to aggregate CV metrics, parse tags, load MLFLOW_* env files, resolve experiment names, and push a single MLflow run per invocation.
Extends the cross-validation CLI with --mlflow [ENV_FILE], --mlflow-experiment, --mlflow-tag, and --mlflow-no-report, plus a distinct exit code (3) for MLflow push failures.
Adds unit tests for the logging module (with a fake injected mlflow) and CLI option parsing; updates docs in both the online and in-app copies.

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 7 comments.

File	Description
uv.lock	Adds `mlflow` as an optional extra in the lock metadata.
pyproject.toml	Declares the `mlflow` optional dependency extra.
src/jabs/classifier/__init__.py	Re-exports MLflow helpers and error type from the classifier package.
src/jabs/classifier/mlflow_logging.py	New module implementing MLflow availability checks, env loading, aggregation, tagging, and run/artifact logging.
src/jabs/scripts/cli/cli.py	Adds MLflow-related CLI options and wiring into `run_cross_validation`, including exit code mapping.
src/jabs/scripts/cli/cross_validation.py	Adds MLflow logging after report save, and raises `MlflowLoggingError` on push failure.
tests/classifier/test_mlflow_logging.py	New tests for metrics aggregation, tag parsing, env file loading, experiment selection, and logging behavior via fake MLflow.
tests/scripts/test_cross_validation_cli.py	New tests for MLflow option parsing/forwarding and exit code mapping.
docs/user-guide/cli-tools.md	Documents the cross-validation command and MLflow integration (online docs).
src/jabs/resources/docs/user_guide/cli-tools.md	Mirrors the same CLI + MLflow documentation for the in-app docs.

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.

gbeane · 2026-06-24T23:58:35Z

Resolved in 7c651d8 (docstring) and the PR description has been updated.

cross_validation.py docstring now states the CLI fails fast with an error before running when --mlflow is requested without the mlflow extra installed (instead of "warns").
PR description's exit-codes section updated: --mlflow without the extra → exit 1 (fail-fast, before CV runs); runtime push failure still → exit 3.

Implementation, tests, docs (both copies), docstring, and PR description are now consistent.

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.

gbeane · 2026-06-25T00:34:23Z

Both Copilot review comments are resolved in f8dfadd.

cli.py env-file validation: the --mlflow ENV_FILE path is now expanded (~ resolved) and validated up front; a missing file fails fast with an error (exit 1) before running cross-validation, instead of aborting the push at the end (exit 3).
mlflow_available() docstring: no longer overstates the guarantee — it now says a located spec means the package is installed and discoverable, not that import mlflow is guaranteed to succeed. Also fixed a second stale sentence that still described the old warn-and-skip behavior.

Added test_mlflow_missing_env_file_fails_fast and test_mlflow_env_file_tilde_expanded; documented the new exit-1 case in both cli-tools.md copies and the PR description. tests/classifier/ + tests/scripts/: 305 passed, lint clean. Both review threads marked resolved.

gbeane added 4 commits June 23, 2026 14:04

Add optional MLflow logging to cross-validation CLI

1c7672a

Warn and skip MLflow logging when the mlflow extra is not installed

2e0ec33

Document cross-validation CLI and MLflow integration in user guide

a5e0832

Log cross-validation runs to a per-behavior MLflow experiment

94b4c82

gbeane requested a review from Copilot June 24, 2026 00:38

Copilot started reviewing on behalf of gbeane June 24, 2026 00:39

Copilot AI reviewed Jun 24, 2026

Address PR review: scope MLflow option parsing, fix docstring/docs/error wrapping

2de35e4

gbeane requested a review from Copilot June 24, 2026 01:12

gbeane self-assigned this Jun 24, 2026

gbeane requested review from bergsalex and keithshep June 24, 2026 01:12

Copilot started reviewing on behalf of gbeane June 24, 2026 01:12

Copilot AI reviewed Jun 24, 2026

Comment thread tests/classifier/test_mlflow_logging.py Outdated

Make load_env_file test robust to ambient environment

75f86a6

keithshep approved these changes Jun 24, 2026

Comment thread src/jabs/scripts/cli/cli.py Outdated

Fail fast when --mlflow requested but mlflow extra is not installed

3cd1525

gbeane requested a review from Copilot June 24, 2026 19:45

Copilot started reviewing on behalf of gbeane June 24, 2026 19:45

Copilot AI reviewed Jun 24, 2026

Comment thread src/jabs/scripts/cli/cross_validation.py Outdated

Comment thread src/jabs/scripts/cli/cli.py

Fix cross_validation docstring to describe fail-fast MLflow behavior

7c651d8

gbeane requested a review from Copilot June 24, 2026 23:59

Copilot started reviewing on behalf of gbeane June 25, 2026 00:00

Copilot AI reviewed Jun 25, 2026

Comment thread src/jabs/scripts/cli/cli.py

Comment thread src/jabs/classifier/mlflow_logging.py

Validate --mlflow env file up front and fail fast if missing

f8dfadd

gbeane merged commit e9ec24b into main Jun 25, 2026
5 checks passed

gbeane deleted the feature/cv-cli-mlflow-logging branch June 25, 2026 00:35

3 participants
